# High-Resolution Vision

Oryx-1.5-7B is a 7B-parameter model built on the Qwen2.5 language model. It supports a 32K-token context window and specializes in efficiently processing visual inputs of arbitrary spatial size and duration.

Text-to-Video, Supports Multiple Languages

Sapiens Depth 1b Bfloat16

Sapiens is a vision Transformer model pre-trained on 300 million human images at 1024×1024 resolution, focused on human-centric vision tasks.

3D Vision, English

Sapiens Depth 2b Bfloat16

Sapiens-2B is a vision Transformer model pre-trained on 300 million high-resolution human images and optimized for human depth estimation. It supports inference at 1K resolution and generalizes well to real-world scenes.

3D Vision, English

Sapiens Depth 2b Torchscript

Sapiens is a vision Transformer model pre-trained on 300 million 1024×1024 resolution human images, specifically designed for human-centric vision tasks with exceptional generalization capabilities.

3D Vision, English
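The "1b"/"2b" and "Bfloat16" parts of the names above hint at how much memory the weights alone would need. The following is a rough back-of-the-envelope sketch, not a figure from the model authors: it assumes "1b" and "2b" mean roughly 1 and 2 billion parameters (nominal values, the exact counts may differ) and relies only on the fact that bfloat16 stores each parameter in 2 bytes. The function name `weight_memory_gib` is hypothetical, and activations and framework overhead are not included.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str = "bfloat16") -> float:
    """Estimate the memory needed for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Nominal parameter counts assumed from the "1b" / "2b" suffixes.
nominal_sizes = [("Sapiens Depth 1b", 1e9), ("Sapiens Depth 2b", 2e9)]

for name, params in nominal_sizes:
    gib = weight_memory_gib(params, "bfloat16")
    print(f"{name}: about {gib:.2f} GiB of weights in bfloat16")
```

Under these assumptions, a bfloat16 checkpoint takes half the weight memory of the same model in float32, which is one practical reason such variants are published separately.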

© 2025 AIbase